204 research outputs found

    YOLO-BEV: Generating Bird's-Eye View in the Same Way as 2D Object Detection

    Vehicle perception systems strive to achieve comprehensive and rapid visual interpretation of their surroundings for improved safety and navigation. We introduce YOLO-BEV, an efficient framework that harnesses a unique surround-camera setup to generate a 2D bird's-eye view of the vehicular environment. By strategically positioning eight cameras, each at a 45-degree interval, our system captures and integrates imagery into a coherent 3x3 grid format, leaving the center blank, providing an enriched spatial representation that facilitates efficient processing. In our approach, we employ YOLO's detection mechanism, favoring its inherent advantages of swift response and compact model structure. Instead of leveraging the conventional YOLO detection head, we augment it with a custom-designed detection head, translating the panoramically captured data into a unified bird's-eye view map of the ego car. Preliminary results validate the feasibility of YOLO-BEV in real-time vehicular perception tasks. With its streamlined architecture and potential for rapid deployment due to minimized parameters, YOLO-BEV is a promising tool that may reshape future perspectives in autonomous driving systems.
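
The 3x3 grid input described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say which camera goes in which cell, so the mapping `ANGLE_TO_CELL` (angles measured clockwise from straight ahead), the function name `tile_surround_views`, and the use of plain nested lists as images are all assumptions.

```python
# Assumed layout (hypothetical, not given in the abstract): camera heading in
# degrees, clockwise from straight ahead, mapped to a (row, col) grid cell.
# The center cell (1, 1) is intentionally absent and stays blank.
ANGLE_TO_CELL = {
    315: (0, 0), 0: (0, 1), 45: (0, 2),
    270: (1, 0), 90: (1, 2),
    225: (2, 0), 180: (2, 1), 135: (2, 2),
}


def tile_surround_views(images, tile_h, tile_w, blank=0):
    """Place eight camera tiles into a 3x3 canvas with a blank center.

    images: dict mapping camera angle -> 2D list of size tile_h x tile_w.
    Returns a 2D list of size (3 * tile_h) x (3 * tile_w).
    """
    canvas = [[blank] * (3 * tile_w) for _ in range(3 * tile_h)]
    for angle, (row, col) in ANGLE_TO_CELL.items():
        tile = images[angle]
        for r in range(tile_h):
            for c in range(tile_w):
                canvas[row * tile_h + r][col * tile_w + c] = tile[r][c]
    return canvas
```

The composite canvas can then be passed to a single detector as one image, which is what lets a 2D detection pipeline see all eight views at once.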

    Federated Learning Framework Coping with Hierarchical Heterogeneity in Cooperative ITS

    In this paper, we introduce a federated learning framework coping with Hierarchical Heterogeneity (H2-Fed), which can notably enhance the conventional pre-trained deep learning model. The framework exploits data from connected public traffic agents in vehicular networks without affecting user data privacy. By coordinating existing traffic infrastructure, including roadside units and road traffic clouds, the model parameters are efficiently disseminated by vehicular communications and hierarchically aggregated. Considering the individual heterogeneity of data distribution, computational and communication capabilities across traffic agents and roadside units, we employ a novel method that addresses the heterogeneity of different aggregation layers of the framework architecture, i.e., aggregation in the layers of roadside units and cloud. The experimental results indicate that our method can balance learning accuracy and stability well according to the knowledge of heterogeneity in current communication networks. Compared to other baseline approaches, the evaluation on a Non-IID MNIST dataset shows that our framework is more general and capable, especially in application scenarios with low communication quality. Even when 90% of the agents are timely disconnected, the pre-trained deep learning model can still be forced to converge stably, and its accuracy can be enhanced from 68% to over 90% after convergence.
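
The two-layer aggregation in this abstract (agents to roadside units, roadside units to cloud) can be illustrated with a small sketch. It assumes a plain sample-count-weighted average at both layers, in the style of FedAvg, and simply skips disconnected agents. The heterogeneity-aware weighting that H2-Fed actually proposes is not described in the abstract and is not reproduced here. The function names are hypothetical.

```python
def weighted_average(models, weights):
    """Element-wise weighted mean of flat parameter lists."""
    total = sum(weights)
    dim = len(models[0])
    return [sum(w * m[i] for m, w in zip(models, weights)) / total
            for i in range(dim)]


def hierarchical_aggregate(rsu_groups):
    """Aggregate agent models per roadside unit (RSU), then across RSUs.

    rsu_groups: one list per RSU; each entry is (params, n_samples) for a
    connected agent or None for a disconnected one (assumed convention).
    Returns the cloud-level parameters, or None if no agent reported.
    """
    rsu_models, rsu_weights = [], []
    for group in rsu_groups:
        connected = [agent for agent in group if agent is not None]
        if not connected:
            continue  # this RSU received nothing in this round
        params = [p for p, _ in connected]
        counts = [n for _, n in connected]
        rsu_models.append(weighted_average(params, counts))
        rsu_weights.append(sum(counts))
    if not rsu_models:
        return None
    return weighted_average(rsu_models, rsu_weights)
```

For example, `hierarchical_aggregate([[([1.0, 2.0], 1), ([3.0, 4.0], 3)], [None, ([5.0, 6.0], 4)], [None]])` averages the two connected agents under the first RSU, takes the single connected agent under the second, and ignores the third RSU.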

    Skipped Feature Pyramid Network with Grid Anchor for Object Detection

    CNN-based object detection methods have achieved significant progress in recent years. The classic structures of CNNs produce pyramid-like feature maps due to pooling or other re-scaling operations. The feature maps at different levels of the feature pyramid are used to detect objects at different scales. For more accurate object detection, the highest-level feature, which has the lowest resolution and contains the strongest semantics, is up-scaled and connected with the lower-level features to enhance the semantics in the lower-level features. However, the classic mode of feature connection combines each lower-level feature with all the features above it, which may result in semantics degradation. In this paper, we propose a skipped connection to obtain stronger semantics at each level of the feature pyramid. In our method, the lower-level feature only connects with the feature at the highest level, making it more reasonable that each level is responsible for detecting objects with fixed scales. In addition, we simplify the generation of anchors for bounding box regression, which can further improve the accuracy of object detection. Experiments on MS COCO and Wider Face demonstrate that our method outperforms the state-of-the-art methods.
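
The skipped connection described in this abstract can be sketched as below: every lower level is fused only with the up-scaled highest-level map, instead of with a running top-down sum of all levels above it. The sketch assumes nearest-neighbour up-sampling and element-wise addition on single-channel maps stored as nested lists. The paper's actual fusion operator and layer shapes are not given in the abstract, and the function names are hypothetical.

```python
def upsample_nearest(feature, factor):
    """Nearest-neighbour up-sampling of a 2D list by an integer factor."""
    out = []
    for row in feature:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def skipped_fusion(pyramid):
    """Fuse each level with the highest level only.

    pyramid: 2D feature maps ordered from the lowest level (highest
    resolution) to the highest level (lowest resolution). Each map's side
    length is assumed to be an integer multiple of the top map's.
    """
    top = pyramid[-1]
    fused = []
    for feature in pyramid[:-1]:
        factor = len(feature) // len(top)
        up = upsample_nearest(top, factor)
        fused.append([[a + b for a, b in zip(f_row, u_row)]
                      for f_row, u_row in zip(feature, up)])
    fused.append([list(row) for row in top])  # top level is kept as-is
    return fused
```

With three levels filled with 1, 10 and 100, the lowest fused level holds 101 everywhere, whereas a classic cumulative top-down pathway with the same addition would give 111 because the middle level would also be mixed in.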

    ResFed: Communication Efficient Federated Learning by Transmitting Deep Compressed Residuals

    Federated learning enables cooperative training among massively distributed clients by sharing their learned local model parameters. However, with increasing model size, deploying federated learning requires a large communication bandwidth, which limits its deployment in wireless networks. To address this bottleneck, we introduce a residual-based federated learning framework (ResFed), where residuals rather than model parameters are transmitted in communication networks for training. In particular, we integrate two pairs of shared predictors for the model prediction in both server-to-client and client-to-server communication. By employing a common prediction rule, both locally and globally updated models are always fully recoverable in clients and the server. We highlight that the residuals only indicate the quasi-update of a model in a single inter-round, and hence contain denser information and have lower entropy than model weights and gradients. Based on this property, we further conduct lossy compression of the residuals by sparsification and quantization and encode them for efficient communication. The experimental evaluation shows that ResFed incurs remarkably lower communication costs and achieves better accuracy by leveraging less sensitive residuals, compared to standard federated learning. For instance, to train a 4.08 MB CNN model on CIFAR-10 with 10 clients under a non-independent and identically distributed (Non-IID) setting, our approach achieves a compression ratio over 700X in each communication round with minimum impact on the accuracy. To reach an accuracy of 70%, it saves around 99% of the total communication volume from 587.61 Mb to 6.79 Mb in up-streaming and to 4.61 Mb in down-streaming on average for all clients.
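
The idea of sending a compressed residual instead of the full model can be sketched as follows. The sketch makes several assumptions that the abstract does not specify: the shared prediction is passed in as a plain parameter list, sparsification keeps the `keep` largest-magnitude entries, and quantization is uniform with step size `step`. The function names are hypothetical, and the entropy coding stage mentioned in the abstract is omitted.

```python
def compress_residual(model, predicted, keep, step):
    """Return a sparse, quantized residual as {index: integer level}.

    model:     updated parameters held by the sender.
    predicted: parameters produced by the predictor both sides share.
    keep:      number of largest-magnitude residual entries to send.
    step:      uniform quantization step (assumed scheme).
    """
    residual = [m - p for m, p in zip(model, predicted)]
    order = sorted(range(len(residual)),
                   key=lambda i: abs(residual[i]), reverse=True)
    return {i: round(residual[i] / step) for i in sorted(order[:keep])}


def recover_model(predicted, payload, step):
    """Rebuild an approximate model on the receiver from the payload."""
    restored = list(predicted)
    for i, level in payload.items():
        restored[i] += level * step
    return restored
```

Because both sides compute the same `predicted` values, only the payload has to cross the network. Entries dropped by sparsification fall back to the predicted value, which is where the lossy error comes from.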

    End-to-End Insulator String Defect Detection in a Complex Background Based on a Deep Learning Model

    Normal power line insulators ensure the safe transmission of electricity. Defects in an insulator reduce its insulation, which may lead to the failure of power transmission systems. As unmanned aerial vehicles (UAVs) have developed rapidly, it is possible for workers to take and upload aerial images of insulators. A technology that detects insulator defects with high accuracy in a short time can be of great value. Existing methods struggle with complex backgrounds, so they must first locate and extract the insulators. Some of them make detection dependent on specific conditions such as angle, brightness, and object scale. This study aims to make end-to-end detections using aerial images of insulators, giving the locations of insulators and defects at the same time while overcoming the disadvantages mentioned above. A DEtection TRansformer (DETR) having an encoder–decoder architecture adopts a convolutional neural network (CNN) as the backbone network, applies a self-attention mechanism for computing, and utilizes object queries instead of a hand-crafted process to give direct predictions. We modified this for insulator detection in complex aerial images. Based on the dataset we constructed, our model achieves 97.97 in mean average precision when setting the threshold of intersection over union at 0.5, which is better than Cascade R-CNN and YOLOv5. The inference speed of our model can reach 25 frames per second, which is sufficient for practical use. Experimental results demonstrate that our model meets the robustness and accuracy requirements for insulator defect detection.
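
The intersection-over-union threshold of 0.5 mentioned in this abstract decides whether a predicted box counts as a correct detection. A minimal sketch of that check, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates (the box format and function names are assumptions, not taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A prediction matches a ground-truth box if IoU reaches the threshold."""
    return iou(pred_box, truth_box) >= threshold
```

Mean average precision at this threshold is then computed from the precision-recall curve built over such matches, per class, and averaged.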

    Efficient network-matrix architecture for general flow transport inspired by natural pinnate leaves

    Networks embedded in three-dimensional matrices are beneficial for delivering physical flows to the matrices. Leaf architectures, pervasive natural network-matrix architectures, endow leaves with high transpiration rates and low water pressure drops, providing inspiration for efficient network-matrix architectures. In this study, the network-matrix model for general flow transport inspired by natural pinnate leaves is investigated analytically. The results indicate that the optimal network structure inspired by natural pinnate leaves can greatly reduce the maximum potential drop and the total potential drop caused by the flow through the network while maximizing the total flow rate through the matrix. These results can be used to design efficient networks in network-matrix architectures for a variety of practical applications, such as tissue engineering, cell culture, photovoltaic devices and heat transfer.

    Full waveform inversion based on dynamic data matching of convolutional wavefields

    The cycle-skipping problem, caused by the absence of low frequencies and an inaccurate initial model, makes full waveform inversion (FWI) deviate from the true model. A novel method is proposed to mitigate the cycle-skipping phenomenon by dynamic data matching, which improves the matching of synthetic and observed events to steer the updating of the initial model in the correct direction. One-dimensional (1-D) Gaussian convolutional kernels with different lengths are used to extract features of each time sample in each trace, which represent the integrated properties of the wavefield over different time ranges centered on each time sample. According to the minimum Euclidean distance of the features, the optimally matched pairs of time samples in the observed and synthetic traces can be found. To improve the accuracy of data matching, a constraint is proposed that evaluates the reliability of dynamic matching by attenuating the amplitude of the synthetic data according to the traveltime differences between each pair of optimally matched time samples. In addition, Gaussian kernels can accurately extract features of time samples contaminated by strong noise, which further improves the robustness of the proposed method. The selection scheme of optimal parameters is discussed and summarized to ensure the convergence of the proposed method. Numerical tests on the Marmousi model verify the feasibility of the proposed method. The proposed method provides a new approach to tackle the convergence problem of FWI when using field seismic data.
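
The matching step in this abstract (multi-length Gaussian features per time sample, then nearest neighbour by Euclidean distance) can be sketched as follows. Several details are assumptions because the abstract does not give them: kernels have odd length with standard deviation `length / 6`, traces are zero-padded at the edges, and the search is unconstrained over the whole trace. The amplitude-attenuation constraint based on traveltime differences is not included, and the function names are hypothetical.

```python
import math


def gaussian_kernel(length):
    """Normalized 1-D Gaussian kernel of odd length (sigma is assumed)."""
    half = length // 2
    sigma = length / 6.0
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def extract_features(trace, lengths):
    """One feature vector per time sample: one value per kernel length."""
    kernels = [gaussian_kernel(length) for length in lengths]
    n = len(trace)
    features = []
    for t in range(n):
        vector = []
        for kernel in kernels:
            half = len(kernel) // 2
            acc = 0.0
            for k, w in enumerate(kernel):
                j = t + k - half
                if 0 <= j < n:  # zero padding outside the trace
                    acc += w * trace[j]
            vector.append(acc)
        features.append(vector)
    return features


def match_samples(observed, synthetic, lengths=(3, 5, 9)):
    """For each synthetic sample, index of the closest observed sample."""
    f_obs = extract_features(observed, lengths)
    f_syn = extract_features(synthetic, lengths)
    matches = []
    for f in f_syn:
        best = min(range(len(f_obs)),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(f, f_obs[j])))
        matches.append(best)
    return matches
```

The difference between a synthetic sample's index and its matched observed index gives the traveltime shift in samples, which is the quantity the abstract's reliability constraint operates on.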

    Effects of Flotage on Immersion Indentation Results of Bone Tissue: An Investigation by Finite Element Analysis

    The nanoindentation test is an efficient technique for probing the mechanical properties of biological tissue that is soaked in liquid media to preserve its bioactivity. However, the flotage imposed on the indenter leads to inaccuracy when calculating mechanical properties (for instance, elastic modulus and hardness) by depth-sensing nanoindentation. In this paper, the effects of flotage on the nanoindentation results of cortical bone were investigated by finite element analysis (FEA) simulation. Nanoindentation simulation results for bone samples with and without soaking in the liquid media were compared. The results show that the difference between the load-displacement curves of soaked and unsoaked samples varies widely with indentation depth. In other words, nanoindentation measurements in liquid media will cause significant error in the calculated Young's modulus and hardness due to the flotage. Accounting for the effect of flotage is therefore particularly important for the accurate biomechanical characterization of biological samples.